Escaping Saddles with Stochastic Gradients

Authors

  • Hadi Daneshmand
  • Jonas Kohler
  • Aurelien Lucchi
  • Thomas Hofmann
Abstract

We analyze the variance of stochastic gradients along negative curvature directions in certain nonconvex machine learning models and show that stochastic gradients exhibit a strong component along these directions. Furthermore, we show that, contrary to the case of isotropic noise, this variance is proportional to the magnitude of the corresponding eigenvalues and does not decrease with the dimensionality. Based on this observation, we propose a new assumption under which the explicit, isotropic noise that is usually injected to make gradient descent escape saddle points can be replaced by a simple SGD step. Additionally, under the same assumption, we derive the first convergence rate for plain SGD to a second-order stationary point in a number of iterations that is independent of the problem dimension.
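
The first claim of the abstract can be probed numerically. The sketch below is an illustration only, not the authors' code or experiment: it assumes a toy half-space model with per-sample loss f_z(w) = sigmoid(-w^T z), Gaussian data, and a random evaluation point, and the function name `negative_curvature_fraction` is hypothetical. It measures what fraction of the stochastic gradient's second moment lies along the eigenvector of the smallest Hessian eigenvalue, and prints it next to the value 1/d that isotropic noise would give.

```python
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def negative_curvature_fraction(w, Z):
    """For the assumed toy loss f_z(w) = sigmoid(-w^T z), return
    (smallest Hessian eigenvalue,
     second moment of per-sample gradients along its eigenvector,
     that second moment divided by the mean squared gradient norm)."""
    t = Z @ w
    s = sigmoid(-t)
    # Derivatives of sigmoid(-t) with respect to t.
    d1 = -s * (1.0 - s)
    d2 = s * (1.0 - s) * (1.0 - 2.0 * s)
    grads = d1[:, None] * Z                      # per-sample gradients, shape (n, d)
    hessian = (Z * d2[:, None]).T @ Z / Z.shape[0]
    eigvals, eigvecs = np.linalg.eigh(hessian)   # eigenvalues in ascending order
    v = eigvecs[:, 0]                            # unit eigenvector of the smallest eigenvalue
    second_moment = np.mean((grads @ v) ** 2)
    total = np.mean(np.sum(grads ** 2, axis=1))
    return eigvals[0], second_moment, second_moment / total


rng = np.random.default_rng(0)
d, n = 50, 2000                                  # assumed toy dimensions
Z = rng.normal(size=(n, d))
w = rng.normal(size=d) / np.sqrt(d)

lam_min, m2, frac = negative_curvature_fraction(w, Z)
isotropic_fraction = 1.0 / d                     # share of isotropic noise along any fixed unit direction

print("smallest Hessian eigenvalue:", lam_min)
print("gradient second moment along v:", m2)
print("fraction along v (SGD):", frac)
print("fraction along v (isotropic):", isotropic_fraction)
```

The normalization by the mean squared gradient norm makes the two fractions comparable: isotropic noise spreads its energy evenly, so its share along any single direction shrinks as 1/d, which is the dimension dependence the abstract contrasts with SGD.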

Similar resources

Simultaneous estimation of noise variance and number of peaks in Bayesian spectral deconvolution

Heuristic identification of peaks from noisy complex spectra often leads to misunderstanding physical and chemical properties of matter. In this paper, we propose a framework based on Bayesian inference, which enables us to separate multi-peak spectra into single peaks statistically and is constructed in two steps. The first step is estimating both noise variance and number of peaks as hyperpar...

Smoothed Gradients for Stochastic Variational Inference

Stochastic variational inference (SVI) lets us scale up Bayesian computation to massive data. It uses stochastic optimization to fit a variational distribution, following easy-to-compute noisy natural gradients. As with most traditional stochastic optimization methods, SVI takes precautions to use unbiased stochastic gradients whose expectations are equal to the true gradients. In this paper, w...

Computational aspects of Shapley's saddles

Game-theoretic solution concepts, such as Nash equilibrium, are playing an ever increasing role in the study of systems of autonomous computational agents. A common criticism of Nash equilibrium is that its existence relies on the possibility of randomizing over actions, which in many cases is deemed unsuitable, impractical, or even infeasible. In work dating back to the early 1950s Lloyd Shapl...

Stochastic multiresonance due to interplay between noise and fractals.

Stochastic multiresonance is shown to occur in a general class of threshold-crossing systems, in which a derivative of the threshold-crossing probability with respect to a system parameter is a nonmonotonic function of the noise intensity. As an example, a two-dimensional chaotic map is considered, where the threshold-crossing probability follows the overlap of the fractal structures of chaotic...

Quasi-saddles of Liquids: Computational Study of a Bulk Lennard-Jones System

Quasi-saddles or inherent saddles of the potential energy surface, U, of a liquid are defined as configurations which correspond to absolute minima of the pseudo-potential surface, W = |∇U|, as identified by a multi-dimensional minimisation procedure. The sensitivity of statistical properties of inherent saddles to the convergence criteria of the minimisation procedure is investigated using, ...

Publication date: 2018